Abstract
Contemporary risk stratification algorithms commonly use threshold-defined categories of clinically relevant risk factors. The Children's Oncology Group (COG) uses National Cancer Institute (NCI) risk group (RG), cytogenetics, and early response to therapy measured by minimal residual disease (MRD) using flow cytometry on day 8 peripheral blood (D8 PB) and day 29 bone marrow (D29 BM). However, it is unclear whether assigning different weights to individual risk factors, and treating numerical values as continuous rather than categorical variables, would predict relapse risk more accurately. Previous work (Loh, ASH 2020) described validation of a continuous prognostic index (PI) for risk of relapse published by UK investigators incorporating favorable and unfavorable genetics, white blood cell count (WBC), and D29 BM as continuous variables (O'Connor, JCO 2018; Enshaei, Blood 2020), and assessed the added value of D8 PB. We now extend this work by comparing patient outcomes with current COG risk classification to PI-derived risk classifications on the previously described population (Loh, ASH 2020).
We first retrospectively classified patients (pts) (N=21,199 from prior COG trials AALL0331/0232 or AALL0932/1131 enrolled 2004-2019) in our analysis population using the COG risk stratification algorithm employed in the current generation of COG trials. Pts with Down syndrome or BCR/ABL1 were excluded. We classified our analysis population as SR-Favorable [SR-Fav, 24.5% (5-year relapse-free survival (RFS) probability 0.97)], SR-Favorable/Average (not distinguishable because of missing D8 PB) [SR-Fav/Avg, 5.3% (0.96)], SR-Avg [20.5% (0.93)], SR-High [12.5% (0.83)], HR-Fav [3.0% (0.96)], HR [29.6% (0.82)], and Very HR [VHR, 1.1% (0.54)] according to NCI RG, CNS status, cytogenetics, D8 PB where relevant, D29 BM, and end of consolidation (EOC) MRD. Ninety-seven percent of pts had sufficient data to be retrospectively classified, and thus 20,176 pts were considered for subsequent analyses.
We next developed a multivariable model for RFS using log-transformed MRD (τ(MRD)). Temporal external validation was first employed by developing models on AALL0932/1131 data (n=12,453) and then validating them with AALL0331/0232 data (n=7,723). Of the full cohort of 20,176 pts, 24.4% could not be classified by PI COG, primarily due to missing D8 PB MRD, which was not assessed routinely in earlier studies; thus the model was developed on 11,151 pts and validated on 4,103 pts. The COG PI (PI COG) was calculated using the equation PI COG = -0.036 × τ(D8 MRD) - 0.119 × τ(D29 MRD) - 0.914 × CYTO-GR + 0.752 × CYTO-HR + 0.178 × log(WBC). The UK PI (PI UK) was also calculated using published coefficients, PI UK = -0.218 × τ(D29 MRD) - 0.440 × CYTO-GR + 1.066 × CYTO-HR + 0.138 × log(WBC), for comparison to assess the practical significance of adding D8 PB. In contrast to the UK method, we identified risk groups by selecting PI cutoffs that maximized the discrimination of the predictive model as quantified by the concordance probability estimator (CPE) (Barrio, SORT 2017). This objective method of cutpoint determination allows for risk group definition without investigator agreement on exact prespecified risk group characteristics; this method also defined four risk groups (Low, Standard, Intermediate, and High).
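The two index equations can be written as a short Python sketch. The coefficients are those stated above; everything else is an illustrative assumption. The abstract does not define the exact form of τ or the base of the WBC logarithm, so the hypothetical `Patient` record takes already-transformed values, and a missing D8 PB value is assumed to leave PI COG undefined, in line with the unclassifiable pts described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Patient:
    """Hypothetical record; field names are illustrative, not from the source."""
    tau_d29: float                  # transformed D29 BM MRD, tau(D29 MRD)
    cyto_gr: int                    # 1 if favorable (good-risk) cytogenetics, else 0
    cyto_hr: int                    # 1 if unfavorable (high-risk) cytogenetics, else 0
    log_wbc: float                  # log of WBC (base not given in the abstract)
    tau_d8: Optional[float] = None  # transformed D8 PB MRD; None if not assessed


def pi_cog(p: Patient) -> Optional[float]:
    """PI COG from the stated coefficients; None when D8 PB MRD is missing."""
    if p.tau_d8 is None:
        return None
    return (-0.036 * p.tau_d8
            - 0.119 * p.tau_d29
            - 0.914 * p.cyto_gr
            + 0.752 * p.cyto_hr
            + 0.178 * p.log_wbc)


def pi_uk(p: Patient) -> float:
    """PI UK from the published coefficients quoted above (no D8 PB term)."""
    return (-0.218 * p.tau_d29
            - 0.440 * p.cyto_gr
            + 1.066 * p.cyto_hr
            + 0.138 * p.log_wbc)
```

Both functions are plain linear predictors, so any change to the transformation of MRD or WBC only affects how the inputs are prepared, not the functions themselves.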
Cutpoints derived from the two different indices, applied to the pts who could be classified by PI COG (n=15,254), resulted in different proportions of pts in each of the risk groups with generally similar RFS estimates for each group. Using cutpoints estimated for PI COG (-2.073, -1.307, and -0.857), 36.0% (RFS = 0.97) were classified as low, 29.6% (0.93) standard, 17.1% (0.88) intermediate, and 17.4% (0.73) high risk of relapse. For PI UK (-2.916, -2.534, and -1.15), among those who were classifiable by PI COG, 33.4% (0.97) were classified as low, 26.3% (0.93) standard, 30% (0.87) intermediate, and 10.4% (0.69) high.
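Assigning a risk group from a PI value and the cutpoints above amounts to a lookup in a sorted list, sketched here in Python. The cutpoints and group names come from the text; the function name is hypothetical, and the handling of a PI exactly equal to a cutpoint (placed in the higher-risk group here) is an assumption, since the abstract does not state which side is inclusive.

```python
from bisect import bisect_right

# Cutpoints as reported in the text, in ascending order.
COG_CUTPOINTS = (-2.073, -1.307, -0.857)
UK_CUTPOINTS = (-2.916, -2.534, -1.15)

# Ordered from lowest to highest PI, i.e. lowest to highest relapse risk.
RISK_GROUPS = ("Low", "Standard", "Intermediate", "High")


def risk_group(pi: float, cutpoints: tuple) -> str:
    """Map a PI value to one of four risk groups.

    Assumption: a PI equal to a cutpoint falls in the higher-risk group.
    """
    return RISK_GROUPS[bisect_right(cutpoints, pi)]
```

For example, a made-up PI COG of -2.286 falls below the first COG cutpoint and is labeled Low, while a made-up PI UK of -2.482 lies between the second and third UK cutpoints and is labeled Intermediate.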
Finally, we compared the COG risk stratification to PI CPE-defined risk stratification in the cohort. As shown in the table, PI COG improves discrimination among individuals by identifying groups with different relapse risk than expected. The PI COG can thus identify patients for whom therapeutic intensification may not result in significantly better outcomes while improving the discrimination of HR pts to allow randomized interventions with achievable hazard ratios.
Loh: MediSix therapeutics: Membership on an entity's Board of Directors or advisory committees. Borowitz: Amgen, Blueprint Medicines: Honoraria. Zweidler-McKay: ImmunoGen: Current Employment. Mullighan: AbbVie: Research Funding; Amgen: Current equity holder in publicly-traded company; Illumina: Membership on an entity's Board of Directors or advisory committees; Pfizer: Research Funding. Hunger: Amgen: Current equity holder in publicly-traded company. Raetz: Pfizer: Research Funding; Celgene: Other: DSMB member.